RAW SEQUENCE LISTING 



The Biotechnology Systems Branch of the Scientific and Technical 
Information Center (STIC) no errors detected. 



Application Serial Number: 09/930,440C

Source: 't^^fr/A) f /n ' .~Z 

Date Processed by STIC: f HY 



ENTERED 



RAW SEQUENCE LISTING DATE: 12/14/2004 

PATENT APPLICATION: US/09/930,440C TIME: 14:00:39



Input Set : A:\03940077pa.txt 

Output Set: N:\CRF4\12142004\I930440C.raw



3 <110> APPLICANT: Betenbaugh, Michael J. 

4 Lawrence, Shawn J. 

5 Lee, Yuan C.

6 Coleman, Timothy A.

8 <120> TITLE OF INVENTION: Engineering Intracellular Sialylation Pathways
10 <130> FILE REFERENCE: 03940077pa
12 <140> CURRENT APPLICATION NUMBER: 09/930,440C
13 <141> CURRENT FILING DATE: 2001-08-16
15 <150> PRIOR APPLICATION NUMBER: US 60/122,582
16 <151> PRIOR FILING DATE: 1999-03-02
18 <150> PRIOR APPLICATION NUMBER: US 60/169,624
19 <151> PRIOR FILING DATE: 1999-12-08
21 <150> PRIOR APPLICATION NUMBER: US 60/227,579
22 <151> PRIOR FILING DATE: 2000-08-25
24 <150> PRIOR APPLICATION NUMBER: US 09/516,793
25 <151> PRIOR FILING DATE: 2000-03-01
27 <160> NUMBER OF SEQ ID NOS: 18
29 <170> SOFTWARE: PatentIn version 3.2

31 <210> SEQ ID NO: 1
32 <211> LENGTH: 1429
33 <212> TYPE: DNA
34 <213> ORGANISM: Homo sapiens
36 <400> SEQUENCE: 1
37 atggccttcc caaagaagaa acttcagggt cttgtggctg caaccatcac gccaatgact  60
39 gagaatggag aaatcaactt ttcagtaatt ggtcagtatg tggattatct tgtgaaagaa  120
41 cagggagtga agaacatttt tgtgaatggc acaacaggag aaggcctgtc cctgagcgtc  180
43 tcagagcgtc gccaggttgc agaggagtgg gtgacaaaag ggaaggacaa gctggatcag  240
45 gtgataattc acgtaggagc actgagcttg aaggagtcac aggaactggc ccaacatgca  300
47 gcagaaatag gagctgatgg catcgctgtc attgcaccgt tcttcctcaa gccatggacc  360
49 aaagatatcc tgattaattt cctaaaggaa gtggctgctg ccgcccctgc cctgccattt  420
51 tattactatc acattcctgc cttgacaggg gtaaagattc gtgctgagga gttgttggat  480
53 gggattctgg ataagatccc caccttccaa gggctgaaat tcagtgatac agatctctta  540
55 gacttcgggc aatgtgttga tcagaatcgc cagcaacagt ttgctttcct ttttggggtg  600
57 gatgagcaac tgttgagtgc tctggtgatg ggagcaactg gagcagtggg cagttttgta  660
59 tccagagatt tatcaacttt gttgtcaaac taggttttgg agtgtcacag accaaagcca  720
61 tcatgactct ggtctctggg attccaatgg gcccaccccg gcttccactg cagaaagcct  780
63 ccagggagtt tactgatagt gctgaagcta aactgaagag cctggatttc ctttctttca  840
65 ctgatttaaa ggatggaaac ttggaagctg gtagctagtg cctctctatc aaatcagggt  900
67 ttgcaccttg agacataatc taccttaaat agtgcatttt tttctcaggg aattttagat  960
69 gaacttgaat aaactctcct agcaaatgaa atctcacaat aagcattgag gtaccttttg  1020
71 tgagccttaa aaagtcttat tttgtgaagg ggcaaaaact ctaggagtca caactctcag  1080
73 tcattcattt cacagatttt tttgtggaga aatttctgtt tatatggatg aaatggaatc  1140
75 aagaggaaaa ttgtaattga ttaattccat ctgtctttag gagctctcat tatctcggtc  1200
77 tctggttcct aatcctattt taaagttgtc taattttaaa ccactataat atgtcttcat  1260
79 tttaataaat attcatttgg aatctaggaa aactctgagc tactgcattt aggcaggcac  1320
81 tttaatacca aactgtaaca tgtctcaact gtatacaact caaaatacac cagctcattt  1380
83 ggctgctcag tctaactcta gaatggatgc ttttgaattc atttcgatg  1429

86 <210> SEQ ID NO: 2
87 <211> LENGTH: 304
88 <212> TYPE: PRT
89 <213> ORGANISM: Homo sapiens
91 <400> SEQUENCE: 2

93 Met Ala Phe Pro Lys Lys Lys Leu Gln Gly Leu Val Ala Ala Thr Ile
94 1 5 10 15

97 Thr Pro Met Thr Glu Asn Gly Glu Ile Asn Phe Ser Val Ile Gly Gln
98 20 25 30

101 Tyr Val Asp Tyr Leu Val Lys Glu Gln Gly Val Lys Asn Ile Phe Val
102 35 40 45

105 Asn Gly Thr Thr Gly Glu Gly Leu Ser Leu Ser Val Ser Glu Arg Arg
106 50 55 60

109 Gln Val Ala Glu Glu Trp Val Thr Lys Gly Lys Asp Lys Leu Asp Gln
110 65 70 75 80

113 Val Ile Ile His Val Gly Ala Leu Ser Leu Lys Glu Ser Gln Glu Leu
114 85 90 95

117 Ala Gln His Ala Ala Glu Ile Gly Ala Asp Gly Ile Ala Val Ile Ala
118 100 105 110

121 Pro Phe Phe Leu Lys Pro Trp Thr Lys Asp Ile Leu Ile Asn Phe Leu
122 115 120 125

125 Lys Glu Val Ala Ala Ala Ala Pro Ala Leu Pro Phe Tyr Tyr Tyr His
126 130 135 140

129 Ile Pro Ala Leu Thr Gly Val Lys Ile Arg Ala Glu Glu Leu Leu Asp
130 145 150 155 160

133 Gly Ile Leu Asp Lys Ile Pro Thr Phe Gln Gly Leu Lys Phe Ser Asp
134 165 170 175

137 Thr Asp Leu Leu Asp Phe Gly Gln Cys Val Asp Gln Asn Arg Gln Gln
138 180 185 190

141 Gln Phe Ala Phe Leu Phe Gly Val Asp Glu Gln Leu Leu Ser Ala Leu
142 195 200 205

145 Val Met Gly Ala Thr Gly Ala Val Gly Ser Phe Val Ser Arg Asp Leu
146 210 215 220

149 Ser Thr Leu Leu Ser Asn Val Leu Glu Cys His Arg Pro Lys Pro Ser
150 225 230 235 240

153 Leu Trp Ser Leu Gly Phe Gln Trp Ala His Pro Gly Phe His Cys Arg
154 245 250 255

157 Lys Pro Pro Gly Ser Leu Leu Ile Val Leu Lys Leu Asn Arg Ala Trp
158 260 265 270

161 Ile Ser Phe Leu Ser Leu Ile Arg Met Glu Thr Trp Lys Leu Val Ala
162 275 280 285

165 Ser Ala Ser Leu Ser Asn Gln Gly Phe Ala Pro Leu Arg His Asn Leu
166 290 295 300

169 <210> SEQ ID NO: 3
170 <211> LENGTH: 1305



file://C:\CRF4\Outhold\VsrI930440C.htm



12/14/04 




171 <212> TYPE: DNA
172 <213> ORGANISM: Homo sapiens
174 <400> SEQUENCE: 3
175 atggactcgg tggagaaggg ggccgccacc tccgtctcca acccgcgggg gcgaccgtcc  60
177 cggggccggc cgccgaagct gcagcgcaac tctcgcggcg gccagggccg aggtgtggag  120
179 aagcccccgc acctggcagc cctaattctg gcccggggag gcagcaaagg catccccctg  180
181 aagaacatta agcacctggc gggggtcccg ctcattggct ggg^^^^ctgcg t9^99ccctg  240
183 gattcagggg ccttccagag tgtatgggtt tcgacagacc atgatgaaat tgagaatgtg  300
185 gccaaacaat ttggtgcaca agttcatcga agaagttctg aagtttcaaa agacagctct  360
187 acctcactag atgccatcat agaatttctt aattatyata atgaggktga cattgtagga  420
189 aatattcaag ctacttctyc atgtttacat cctactgatc ttcaaaaagt tgcagaaatg  480
191 attcgagaag aaggatatga ttctgktttc tctgttgtga gacgccatca gtttcgatgg  540
193 agtgaaattc agaaaggagt tcgtgaagtg accgaacctc tgaatttaaa tccagctaaa  600
195 cggcctcgtc gacaagactg ggatggagaa ttatatgaaa atggctcatt ttattttgct  660
197 aaaagacatt tgatagagat gggttacttg cagggtggaa aaatggcata ctacgaaatg  720
199 cgagctgaac atagtgtgga tatagatgtg gatattgatt ggcctattgc agagcaaaga  780
201 gtattaagat atggctattt tggcaaagag aagcttaagg aaataaaact tttggtttgc  840
203 aatattgatg gatgtctcac caatggccac atttatgtat caggagacca aaaagaaata  900
205 atatcttatg atgtaaaaga tgctattggg ataagtttat taaagaaaag tggtattgag  960
207 gtgaggctaa tctcagaaag ggcctgttca aagcagacgc tgtcttcttt aaaactggat  1020
209 tgcaaaatgg aagtcagtgt atcagacaag ctagcagttg tagatgaatg gagaaaagaa  1080
211 atgggcctgt gctggaaaga agtggcatat cttggaaatg aagtgtctga tgaagagtgc  1140
213 ttgaagagag tgggcctaag tggcgctcct gctgatgcct gttcctacgc ccagaaggct  1200
215 gttggataca tttgcaaatg taatggtggc cgtggtgcca tccgagaatt tgcagagcac  1260
217 atttgcctac taatggaaaa agttaataat tcatgccaaa aatag  1305

220 <210> SEQ ID NO: 4
221 <211> LENGTH: 434
222 <212> TYPE: PRT
223 <213> ORGANISM: Homo sapiens
226 <220> FEATURE:
227 <221> NAME/KEY: misc_feature
228 <222> LOCATION: (133)..(133)
229 <223> OTHER INFORMATION: Xaa can be any naturally occurring amino acid
231 <220> FEATURE:
232 <221> NAME/KEY: misc_feature
233 <222> LOCATION: (136)..(136)
234 <223> OTHER INFORMATION: Xaa can be any naturally occurring amino acid
236 <220> FEATURE:
237 <221> NAME/KEY: misc_feature
238 <222> LOCATION: (147)..(147)
239 <223> OTHER INFORMATION: Xaa can be any naturally occurring amino acid
241 <220> FEATURE:
242 <221> NAME/KEY: misc_feature
243 <222> LOCATION: (169)..(169)
244 <223> OTHER INFORMATION: Xaa can be any naturally occurring amino acid
246 <400> SEQUENCE: 4

248 Met Asp Ser Val Glu Lys Gly Ala Ala Thr Ser Val Ser Asn Pro Arg
249 1 5 10 15

252 Gly Arg Pro Ser Arg Gly Arg Pro Pro Lys Leu Gln Arg Asn Ser Arg
253 20 25 30



256 Gly Gly Gln Gly Arg Gly Val Glu Lys Pro Pro His Leu Ala Ala Leu
257 35 40 45

260 Ile Leu Ala Arg Gly Gly Ser Lys Gly Ile Pro Leu Lys Asn Ile Lys
261 50 55 60

264 His Leu Ala Gly Val Pro Leu Ile Gly Trp Val Leu Arg Ala Ala Leu
265 65 70 75 80

268 Asp Ser Gly Ala Phe Gln Ser Val Trp Val Ser Thr Asp His Asp Glu
269 85 90 95

272 Ile Glu Asn Val Ala Lys Gln Phe Gly Ala Gln Val His Arg Arg Ser
273 100 105 110

276 Ser Glu Val Ser Lys Asp Ser Ser Thr Ser Leu Asp Ala Ile Ile Glu
277 115 120 125

W--> 280 Phe Leu Asn Tyr Xaa Asn Glu Xaa Asp Ile Val Gly Asn Ile Gln Ala
281 130 135 140

W--> 284 Thr Ser Xaa Cys Leu His Pro Thr Asp Leu Gln Lys Val Ala Glu Met
285 145 150 155 160

W--> 288 Ile Arg Glu Glu Gly Tyr Asp Ser Xaa Phe Ser Val Val Arg Arg His
289 165 170 175

292 Gln Phe Arg Trp Ser Glu Ile Gln Lys Gly Val Arg Glu Val Thr Glu
293 180 185 190

296 Pro Leu Asn Leu Asn Pro Ala Lys Arg Pro Arg Arg Gln Asp Trp Asp
297 195 200 205

300 Gly Glu Leu Tyr Glu Asn Gly Ser Phe Tyr Phe Ala Lys Arg His Leu
301 210 215 220

304 Ile Glu Met Gly Tyr Leu Gln Gly Gly Lys Met Ala Tyr Tyr Glu Met
305 225 230 235 240

308 Arg Ala Glu His Ser Val Asp Ile Asp Val Asp Ile Asp Trp Pro Ile
309 245 250 255

312 Ala Glu Gln Arg Val Leu Arg Tyr Gly Tyr Phe Gly Lys Glu Lys Leu
313 260 265 270

316 Lys Glu Ile Lys Leu Leu Val Cys Asn Ile Asp Gly Cys Leu Thr Asn
317 275 280 285

320 Gly His Ile Tyr Val Ser Gly Asp Gln Lys Glu Ile Ile Ser Tyr Asp
321 290 295 300

324 Val Lys Asp Ala Ile Gly Ile Ser Leu Leu Lys Lys Ser Gly Ile Glu
325 305 310 315 320

328 Val Arg Leu Ile Ser Glu Arg Ala Cys Ser Lys Gln Thr Leu Ser Ser
329 325 330 335

332 Leu Lys Leu Asp Cys Lys Met Glu Val Ser Val Ser Asp Lys Leu Ala
333 340 345 350

336 Val Val Asp Glu Trp Arg Lys Glu Met Gly Leu Cys Trp Lys Glu Val
337 355 360 365

340 Ala Tyr Leu Gly Asn Glu Val Ser Asp Glu Glu Cys Leu Lys Arg Val
341 370 375 380

344 Gly Leu Ser Gly Ala Pro Ala Asp Ala Cys Ser Tyr Ala Gln Lys Ala
345 385 390 395 400

348 Val Gly Tyr Ile Cys Lys Cys Asn Gly Gly Arg Gly Ala Ile Arg Glu
349 405 410 415
352 Phe Ala Glu His Ile Cys Leu Leu Met Glu Lys Val Asn Asn Ser Cys
353 420 425 430

356 Gln Lys

360 <210> SEQ ID NO: 5
361 <211> LENGTH: 1080
362 <212> TYPE: DNA
363 <213> ORGANISM: Homo sapiens
365 <400> SEQUENCE: 5
366 atgccgctgg agctggagct gtgtcccggg cgctgggtgg gcgggcaaca cccgtgcttc  60
368 atcattgccg agatcggcca gaaccaccag ggcgacctgg acgtagccaa gcgcatgatc  120
370 cgcatggcca aggagtgtgg ggctgattgt gccaagttcc agaagagtga gctagaattc  180
372 aagtttaatc ggaaagcctt ggagaggcca tacacctcga agcattcctg ggggaagacg  240
374 tacggggagc acaaacgaca tctggagttc agccatgacc agtacaggga gctgcagagg  300
376 tacgccgagg aggttgggat cttcttcact gcctctggca tggatgagat ggcagttgaa  360
378 ttcctgcatg aactgaatgt tccatttttc aaagttggat ctggagacac taataatttt  420
380 ccttatctgg aaaagacagc caaaaaaggt cgcccaatgg tgatctccag tgggatgcag  480
382 tcaatggaca ccatgaagca agtttatcag atcgtgaagc ccctcaaccc caacttctgc  540
384 ttcttgcagt gtaccagcgc atacccgctc cagcctgagg acgtcaacct gcgggtcatc  600
386 tcggaatatc agaagctctt tcctgacatt cccatagggt attctgggca tgaaacaggc  660
388 atagcgatat ctgtggccgc agtggctctg ggggccaagg tgttggaacg tcacataact  720
390 ttggacaaga cctggaaggg gagtgaccac tcggcctcgc tggagcctgg agaactggcc  780
392 gagctggtgc ggtcagtgcg tcttgtggag cgtgccctgg gctccgcaac caagcagctg  840
394 ctgccctgtg agatggcctg caatgagaag ctgggcaagt ctgtggtggc caaagtgaaa  900
396 attccggaag gcaccattct aacaatggac atgctcaccg tgaaggtggg tgagcccaaa  960
398 gcctatcctc ctgaagacat ctttaatcta gtgggcaaga aggtcctggt cactgttgaa  1020
400 gaggatgaca ccatcatgga agaattggta gataatcatg gcaaaaaaat caagtcttaa  1080

403 <210> SEQ ID NO: 6
404 <211> LENGTH: 359
405 <212> TYPE: PRT
406 <213> ORGANISM: Homo sapiens
408 <400> SEQUENCE: 6

410 Met Pro Leu Glu Leu Glu Leu Cys Pro Gly Arg Trp Val Gly Gly Gln
411 1 5 10 15

414 His Pro Cys Phe Ile Ile Ala Glu Ile Gly Gln Asn His Gln Gly Asp
415 20 25 30

418 Leu Asp Val Ala Lys Arg Met Ile Arg Met Ala Lys Glu Cys Gly Ala
419 35 40 45

422 Asp Cys Ala Lys Phe Gln Lys Ser Glu Leu Glu Phe Lys Phe Asn Arg
423 50 55 60

426 Lys Ala Leu Glu Arg Pro Tyr Thr Ser Lys His Ser Trp Gly Lys Thr
427 65 70 75 80

430 Tyr Gly Glu His Lys Arg His Leu Glu Phe Ser His Asp Gln Tyr Arg
431 85 90 95

434 Glu Leu Gln Arg Tyr Ala Glu Glu Val Gly Ile Phe Phe Thr Ala Ser
435 100 105 110

438 Gly Met Asp Glu Met Ala Val Glu Phe Leu His Glu Leu Asn Val Pro
439 115 120 125

442 Phe Phe Lys Val Gly Ser Gly Asp Thr Asn Asn Phe Pro Tyr Leu Glu
443 130 135 140
Please Note:

Use of n and/or Xaa has been detected in the Sequence Listing. Please review the
Sequence Listing to ensure that a corresponding explanation is presented in the <220>
to <223> fields of each sequence which presents at least one n or Xaa.



Seq#:9; N Pos. 1,3,6,12,13,15,18
Seq#:10; N Pos. 3,4,6,9,10,11,12,15,17,18
Seq#:11; N Pos. 3,4,6,9,10,11,12,15,17,18
Seq#:12; N Pos. 2,3,6,9,12,15,16,17,18
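
The position check described in the note above can be sketched in a few lines of Python. This is an illustrative sketch only, not the checker STIC runs: the function name `flag_positions` and its `start` parameter are hypothetical, and the sample row is line 280 of SEQ ID NO: 4 from this listing, which begins at residue 129.

```python
import re

def flag_positions(row, seq_type, start=1):
    """Return 1-based positions of 'n' (DNA) or 'Xaa' (PRT) in one listing row.

    `start` is the position of the first residue in the row (hypothetical
    parameter, used so a single row can be checked in isolation).
    """
    if seq_type == "DNA":
        # Drop spaces, digits and anything else that is not a base letter.
        bases = re.sub(r"[^a-z]", "", row.lower())
        return [start + i for i, base in enumerate(bases) if base == "n"]
    # Protein rows use three-letter codes separated by whitespace.
    return [start + i for i, code in enumerate(row.split()) if code == "Xaa"]

row_280 = "Phe Leu Asn Tyr Xaa Asn Glu Xaa Asp Ile Val Gly Asn Ile Gln Ala"
print(flag_positions(row_280, "PRT", start=129))  # [133, 136]
```

The two positions reported for row 280 agree with the <222> LOCATION entries (133)..(133) and (136)..(136) given for SEQ ID NO: 4.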

Invalid <213> Response:

Use of "Artificial" only as "<213> Organism" response is incomplete,
per 1.823(b) of New Sequence Rules. Valid response is Artificial Sequence.

Seq#: 9,10,11,12,13,14,15,16
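
As a rough illustration of the <213> rule stated above, the following Python sketch flags the bare response "Artificial". It covers only this one rule as worded in the report, not the full requirements of 1.823(b); the function name `check_organism` and the message text are assumptions made for the example.

```python
def check_organism(response):
    """Return a warning string for an incomplete <213> response, else None."""
    # Only the bare word "Artificial" is treated as incomplete here.
    if response.strip().lower() == "artificial":
        return 'Invalid <213> response: use "Artificial Sequence"'
    return None

for value in ("Homo sapiens", "Artificial", "Artificial Sequence"):
    print(value, "->", check_organism(value))
```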
L:280 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:4 after pos.:128
L:284 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:4 after pos.:144
L:288 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:4 after pos.:160
L:686 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:9 after pos.:0
L:750 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:10 after pos.:0
L:814 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:11 after pos.:0
L:873 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:12 after pos.:0